Create a Physical Data Service from a Delimited File

This page last changed on Mar 19, 2008.

How To Create a Physical Data Service from a Delimited File

Spreadsheets offer a highly adaptable means of storing and manipulating information, especially information that needs to be changed quickly. You can easily turn such spreadsheet data into a data service.

Spreadsheet data is often exchanged as CSV (comma-separated values) files. Although CSV is not the native format of most spreadsheet applications, the ability to save a spreadsheet as a CSV file is very common.

You can use the physical data service creation wizard to:

Select a delimited file as the Data Source type.
Select either a schema file or a file with delimited data.
Specify whether the information has a header or not.
Specify the delimiter.
Specify a fixed width value for each column.

Physical Data Service Creation Wizard

The following topics cover the actions necessary to create physical data services from delimited files:

Setting Up the Physical Data Service Creation Wizard
Specifying Delimited File Information
Setting Properties for New Library Functions
Verifying Data Service Composition

Setting Up the Physical Data Service Creation Wizard

Physical data services are created using a wizard.

Physical Data Service Creation Wizard

Starting the Wizard

To start the physical data service creation wizard:

Right-click on your dataspace project or any folder in your project.
Choose New > Physical Data Service.

Creating a New Physical Data Service

Specifying Delimited File Information

A Library data service based on delimited data requires one or both of the following:

A schema in your project
The location of the delimited data file

Import Delimited File Data Wizard

The schema and data file must be available in your dataspace.

Providing a Document Name, a Schema Name, or Both

There are several approaches to developing metadata around delimited information, depending on your needs and the nature of the source.

Providing a delimited document name only. If you supply the import wizard with the name of a valid CSV file, the wizard automatically creates a schema based on the columns in the document. All the columns will be of type string, although you can later modify the generated schema to supply more accurate type information. The generated schema has the same name as the source file.
Providing a schema name only. This option is typically used when the source file is dynamic; for example, when data is streamed.
Providing both a schema and a document name. Providing a schema with a CSV file gives you the ability to more accurately type information in the columns of a delimited document.
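The first approach above (document name only, every column typed as string) can be pictured with a short Python sketch. This is only an illustration under stated assumptions: the function name sketch_string_schema, the sample column names, and the layout of the resulting schema are hypothetical, and the schema the wizard actually generates may be structured differently.

```python
import csv
import io
from xml.sax.saxutils import quoteattr


def sketch_string_schema(csv_text, root_name, delimiter=","):
    """Return an illustrative XSD in which every header column is xs:string.

    Hypothetical helper: it mirrors the idea of an all-string schema,
    not the exact output of the import wizard.
    """
    # The first record of the delimited text is treated as the header row.
    header = next(csv.reader(io.StringIO(csv_text), delimiter=delimiter))
    lines = [
        '<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">',
        "  <xs:element name=" + quoteattr(root_name) + ">",
        "    <xs:complexType>",
        "      <xs:sequence>",
    ]
    for column in header:
        # Every column starts out as a string; a hand-edited schema
        # could later replace xs:string with a more accurate type.
        lines.append(
            "        <xs:element name="
            + quoteattr(column.strip())
            + ' type="xs:string"/>'
        )
    lines += [
        "      </xs:sequence>",
        "    </xs:complexType>",
        "  </xs:element>",
        "</xs:schema>",
    ]
    return "\n".join(lines)


# Sample data invented for this illustration.
sample = "OrderID,CustomerID,OrderDate\n1001,C42,2008-03-19\n"
print(sketch_string_schema(sample, "Orders"))
```

Replacing xs:string with a more specific type for a column such as OrderDate corresponds to the third approach, in which you provide both a schema and a document name.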

Locating the CSV File

Using the import wizard you can browse to any file in your project. You can also import data from any CSV file on your system using an absolute path prepended with:

file:///

For example, on Windows systems you can access a CSV file such as Orders.csv in a directory on the C: drive using the following URI:

file:///<c:/home>/Orders.csv

On a UNIX system, you would access such a file with the URI:

file:///<home>/Orders.csv
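If you build such URIs in a script, the prefixing rule can be sketched in Python as follows. The helper name to_file_uri and the sample paths are assumptions made for illustration; the wizard itself only needs the finished URI.

```python
from urllib.parse import quote


def to_file_uri(absolute_path):
    """Turn an absolute Windows or UNIX path into a file:/// URI.

    Hypothetical helper for illustration only.
    """
    # Use forward slashes throughout, as in the examples above.
    normalized = absolute_path.replace("\\", "/")
    # A Windows path such as c:/home/Orders.csv has no leading slash;
    # adding one yields the three slashes of file:///.
    if not normalized.startswith("/"):
        normalized = "/" + normalized
    # Percent-encode characters such as spaces, but keep "/" and ":".
    return "file://" + quote(normalized, safe="/:")


print(to_file_uri("c:/home/Orders.csv"))
print(to_file_uri("/home/Orders.csv"))
```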

Import Delimited Data Options

Header. Indicates whether the delimited file contains header data. Header data is located in the first row of the spreadsheet. If you check this option, the first row will not be treated as imported data.
Delimited or Fixed Width. Data in your file is either separated by a specific character (such as a comma) or is of a fixed width (such as 10 spaces). If the data is delimited, you also need to provide the delimiter character. By default the delimiter is a comma.
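The difference between the two layouts, and the effect of the Header option, can be sketched in Python. The function names, the sample data, and the per-column width list are assumptions for illustration; they are not part of the product.

```python
import csv
import io


def read_delimited(text, delimiter=",", has_header=True):
    """Split delimited text into (header, data_rows)."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    if has_header and rows:
        # With the Header option set, the first row names the columns
        # and is not treated as data.
        return rows[0], rows[1:]
    return None, rows


def split_fixed_width(line, widths):
    """Cut one line into fields using a width for each column."""
    fields = []
    start = 0
    for width in widths:
        fields.append(line[start:start + width].strip())
        start += width
    return fields


def read_fixed_width(text, widths, has_header=True):
    """Split fixed-width text into (header, data_rows)."""
    rows = [split_fixed_width(line, widths) for line in text.splitlines()]
    if has_header and rows:
        return rows[0], rows[1:]
    return None, rows


# Sample data invented for this illustration.
print(read_delimited("ID,NAME\n1,Ann\n"))
print(read_fixed_width("ID   NAME \n1    Ann  \n", [5, 5]))
```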

Supported Datatypes

The following datatypes are supported for delimited file metadata import operations:

XMLSchemaType.BASE64BINARY
XMLSchemaType.BOOLEAN
XMLSchemaType.DATE
XMLSchemaType.DATETIME
XMLSchemaType.DECIMAL
XMLSchemaType.DOUBLE
XMLSchemaType.FLOAT
XMLSchemaType.INT
XMLSchemaType.INTEGER
XMLSchemaType.LONG
XMLSchemaType.STRING
XMLSchemaType.SHORT
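To show what typing a column means in practice, the following Python sketch converts the raw text of a field according to a type name taken from the list above. The converter table and the function convert_field are hypothetical; they approximate the XML Schema lexical forms and do not reproduce the conversion logic of the product. Range limits (for example, for SHORT or INT) are not enforced here.

```python
import base64
from datetime import date, datetime
from decimal import Decimal


def _parse_boolean(text):
    """Accept the XML Schema boolean literals true, false, 1, and 0."""
    if text in ("true", "1"):
        return True
    if text in ("false", "0"):
        return False
    raise ValueError("not a boolean literal: " + repr(text))


# Hypothetical mapping from type name to a Python converter.
CONVERTERS = {
    "BASE64BINARY": base64.b64decode,
    "BOOLEAN": _parse_boolean,
    "DATE": date.fromisoformat,
    "DATETIME": datetime.fromisoformat,
    "DECIMAL": Decimal,
    "DOUBLE": float,
    "FLOAT": float,
    "INT": int,
    "INTEGER": int,
    "LONG": int,
    "SHORT": int,
    "STRING": str,
}


def convert_field(type_name, text):
    """Convert the raw text of one field using the named type."""
    return CONVERTERS[type_name](text)


# Sample values invented for this illustration.
print(convert_field("INT", "42"))
print(convert_field("DATE", "2008-03-19"))
```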

Additional Considerations

The number of delimiters in each row must equal the number of header columns in your source minus one (number of columns - 1). If any subsequent row contains more delimiters than that, subsequent use of the data service will not be successful.
If the delimited file has rows with a variable number of delimiters (fields), you can supply a schema that contains optional elements for the trailing set of extra elements.
Not all characters are handled the same way. Some characters may need special escape sequences before spreadsheet data can be accessed at runtime.
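Before importing a file, you may want to check the delimiter counts described above. The following Python sketch is one way to do so; the function name and sample data are assumptions. It counts delimiter characters literally and does not account for quoting or escape sequences, so treat its output as a hint rather than a verdict.

```python
def find_rows_with_unexpected_delimiters(text, delimiter=","):
    """Return (line_number, delimiter_count) for rows that differ from the header.

    Hypothetical pre-import check: the first line is taken as the header
    and defines the expected number of delimiters (columns - 1).
    """
    lines = text.splitlines()
    if not lines:
        return []
    expected = lines[0].count(delimiter)
    problems = []
    for line_number, line in enumerate(lines[1:], start=2):
        if not line:
            # Blank lines are skipped rather than reported.
            continue
        found = line.count(delimiter)
        if found != expected:
            problems.append((line_number, found))
    return problems


# Sample data invented for this illustration.
sample = "id,name\n1,Ann\n2,Bob,extra\n3\n"
print(find_rows_with_unexpected_delimiters(sample))
```

Rows reported with fewer delimiters than the header are candidates for a schema with optional trailing elements; rows with more delimiters need to be corrected in the source.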

Setting Properties for New Library Functions

This general topic applies to setting properties for all types of library data service functions.

Use the Review New Data Service Operations page to:

Change the function name.
Set the Public option (check if you want your function to be available to client applications).
Set the kind of function (in some cases only one option will be available).
Set the Primary option (check if you want your function to be the primary of its type). In some cases this option may not be available.
Select a common XML namespace for the entire data service.
Set the target namespace.

The root element, which is read only, is also displayed.

Verifying Data Service Composition

On the Review New Data Service(s) page you can set, confirm or, optionally, change suggested data service names depending on the type of physical data service you are creating.

Default Physical Data Service Names

The nominated name for a new data service is, wherever possible, the same as the source object name. In some cases, however, names are adjusted to conform with XML naming conventions.

XML Name Conversion Considerations

About Automatic Data Service Name Changes

Name conflicts occur when there is a data service of the same name present in the target directory. Name conflicts are highlighted in red.

There are several situations where you will need to change the name of your data service:

There is already a data service of the same name in your application.
You are trying to create multiple data services with the same name.

Data services always have the file extension:

.ds
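The two conflict situations listed above can be expressed as a small Python check. The function name, the sample names, and the exact-match comparison are assumptions for illustration; the wizard performs its own conflict detection and may compare names differently.

```python
from collections import Counter


def find_name_conflicts(proposed_names, existing_files):
    """Map each conflicting proposed name to the reason for the conflict.

    Hypothetical helper: existing_files holds the file names already
    present in the target directory, including the .ds extension.
    """
    counts = Counter(proposed_names)
    existing = set(existing_files)
    conflicts = {}
    for name in proposed_names:
        if name + ".ds" in existing:
            conflicts[name] = "already exists in the target directory"
        elif counts[name] > 1:
            conflicts[name] = "proposed more than once"
    return conflicts


# Sample names invented for this illustration.
print(find_name_conflicts(["Orders", "Customers", "Items", "Items"], ["Orders.ds"]))
```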

Document generated by Confluence on Apr 28, 2008 15:54